Molecular characterization of Streptococcus pyogenes (StrepA) non-invasive isolates during the 2022–2023 UK upsurge

Abstract At the end of 2022 into early 2023, the UK Health Security Agency reported unusually high levels of scarlet fever and invasive disease caused by Streptococcus pyogenes (StrepA or group A Streptococcus). During this time, we collected and genome-sequenced 341 non-invasive throat and skin S. pyogenes isolates identified during routine clinical diagnostic testing in Sheffield, a large UK city. We compared the data with that obtained from a similar collection of 165 isolates from 2016 to 2017. Numbers of throat-associated isolates collected peaked in early December 2022, reflecting the national scarlet fever upsurge, while skin infections peaked later in December. The most common emm-types in 2022–2023 were emm1 (28.7 %), emm12 (24.9 %) and emm22 (7.7 %) in throat and emm1 (22 %), emm12 (10 %), emm76 (18 %) and emm49 (7 %) in skin. While all emm1 isolates were the M1UK lineage, the comparison with 2016–2017 revealed diverse lineages in other emm-types, including emm12, and emergent lineages within other types including a new acapsular emm75 lineage, demonstrating that the upsurge was not completely driven by a single genotype. The analysis of the capsule locus predicted that only 51 % of throat isolates would produce capsule compared with 78% of skin isolates. Ninety per cent of throat isolates were also predicted to have high NADase and streptolysin O (SLO) expression, based on the promoter sequence, compared with only 56% of skin isolates. Our study has highlighted the value in analysis of non-invasive isolates to characterize tissue tropisms, as well as changing strain diversity and emerging genomic features which may have implications for spillover into invasive disease and future S. pyogenes upsurges.


DATA SUMMARY
All new genome sequence data are available on the NCBI short read archive under the bioproject PRJNA1062601, and individual accession numbers are listed in Tables S1 and S2, available in the online Supplementary Material (Supplementary Materials 1 and 2).

INTRODUCTION
The human pathogen Streptococcus pyogenes, also known as group A Streptococcus or StrepA, is a common cause of throat infections, such as pharyngitis and tonsillitis, as well as mild skin infections such as pyoderma or impetigo. More commonly in children than adults, throat infections can progress to scarlet fever, with a characteristic sandpaper-like rash and 'strawberry tongue'. On rare occasions, S. pyogenes can also cause severe and potentially lethal invasive diseases, such as pneumonia, empyema, bacteraemia and necrotizing fasciitis ('flesh-eating' disease). In England, scarlet fever and invasive S. pyogenes/group A Streptococcus (iGAS) disease cases are notifiable to the UK Health Security Agency (UKHSA).
In September 2022, the UKHSA reported an unusually high level of scarlet fever notifications, ~3.7-fold more than in the same period for the previous five seasons [1]. Notifications continued to increase rapidly across England and Wales, with 8688 cases in weeks 37-48 (mid-September to end-November), compared with 333-2536 in the previous five seasons [2]. Alongside the increasing scarlet fever cases, there were also high numbers of iGAS notifications, with 772 in weeks 37-48. Concerningly, during these weeks, a higher proportion of iGAS cases were in children under 15 (26.1 %) compared with previous seasons (6.4-13.3 %), and there were 14 deaths in this age group [2]. Scarlet fever notifications peaked in week 49 with 10 069 cases, and iGAS notifications peaked in week 52 with 213 cases [3]. Typing of isolates by the sequence of the emm gene, which encodes the hypervariable M protein, identified emm1 as the most common cause of iGAS in those older than 15 (31 %) and as the cause of an even higher proportion of cases in those younger than 15 (57 %). emm12 and emm4 were the second (23 %) and third (7 %) most common causes of iGAS in children, at least in the early part of the upsurge [2].
Scarlet fever demonstrates seasonality, with cases typically increasing in late winter and peaking in early spring. An unexpected increase in scarlet fever cases in England was first seen in 2013-2014, peaking in early April and totalling over 13 000 notifications, compared with fewer than 3000 in previous seasons [4,5]. Notifications remained seasonally elevated, rising each year to their highest in 2017-2018 [6], until the COVID-19 (coronavirus disease 2019) pandemic, when cases fell dramatically. No corresponding rise was observed for iGAS notifications until the 2015-2016 season. The link between scarlet fever cases and iGAS cases is not well understood, but it appeared that the increase in both scarlet fever and invasive disease notifications in early 2016 was due to the emergence of a new variant of a lineage carrying emm1, termed M1UK [7,8], which, prior to 2022-2023, also led to the biggest upsurge in cases in 2017-2018 [6]. The M1UK variant had steadily increased in prevalence in England since 2010 and represented 91.5 % of invasive emm1 by 2020 [8]. M1UK is characterized by 27 SNPs in comparison to a globally circulating emm1 population (M1global) and increased expression of the superantigen SpeA [7,9,10]. The M1UK lineage has been detected in European countries, North America, Australia and New Zealand, often associated with increases in disease [7,9,11-17].
While iGAS undergoes routine surveillance in several high-income countries, with the inclusion of scarlet fever and outbreak situations in some, these types of infections only represent a small proportion of streptococcal cases. Our knowledge of circulating non-invasive disease (from non-sterile sites) isolates is severely lacking, yet these isolates may act as an early indicator for increasing prevalence of new genotypes or lineages that could lead to more severe infections. This lack of knowledge also means we have a limited understanding of the connection between certain emm-types or lineages and preferences for causing throat infections or skin infections. As non-invasive isolates are not routinely collected, data acquisition relies on local collections. Sheffield is a large northern city in England with a population of around 600 000. During the 2022-2023 upsurge, the region of Yorkshire and the Humber, which includes Sheffield, reported the highest rates of invasive S. pyogenes in England, at 8.7 per 100 000 population [3]. Scarlet fever notifications were also high in the region, at 132.0 per 100 000 population, although this was similar to the rate seen in the North West region and lower than the East Midlands region [3]. In November 2022, we began to routinely save all non-invasive isolates identified by the Department of Laboratory Medicine at the Northern General Hospital, Sheffield, which performs microbiological services for community care as well as surrounding hospitals. We had also previously performed a similar collection in 2016-2017, which we used as a comparative population. We undertook whole genome sequencing (WGS) analysis of both collections and characterized strain diversity and pathogenicity factors. As expected, the emm1 M1UK lineage dominated during the 2022-2023 upsurge, followed by emm12, but there were some unexpected increases in other emm-types as well as differences between throat-associated isolates and skin-associated isolates.

Isolate collection
A total of 384 non-invasive isolates (from non-sterile sites), presumptively identified from culture as S. pyogenes, were collected from the Department of Laboratory Medicine, Northern General Hospital, Sheffield, UK, between November 2022 (week 45) and February 2023 (week 6). The Department of Laboratory Medicine performs microbiology diagnostics for NHS Trusts as well as primary care and community services, acting as the single regional diagnostic microbiology laboratory for a population of around 600 000 adults and children in Sheffield. Anonymized clinical data were collected for each isolate from the sample request information and electronic clinical patient records: swab source (throat/skin/ear/eye/nose), sampling date, age, sex and infection type. Cases were considered to be associated with scarlet fever where scarlet fever was queried by the clinician or a rash consistent with a scarlet fever diagnosis was described in the clinical details accompanying the request. All other throat samples, and other sample types, were deemed non-scarlet fever samples. Isolates from ear, eye and nose swabs were excluded from further downstream analyses to focus the comparison on throat and skin isolates.
For comparison, a further 229 archived non-invasive S. pyogenes isolates collected in a similar manner between October 2016 and January 2017 were also included in this study, with data collected as above. The 2016-2017 season also showed a seasonal upsurge in disease compared with previous years, albeit with substantially fewer scarlet fever and invasive S. pyogenes cases reported than in 2022-2023. The age distribution of cases in 2016-2017, of both scarlet fever and invasive S. pyogenes disease, was consistent with previous years, and this was therefore considered a suitably representative comparative sample.

Whole genome sequencing
Genomic DNA was extracted from all isolates using a previously described method [18]. Genomic DNA from isolates collected in 2022-2023 underwent WGS at Earlham Institute by the Genomics Pipeline group, using the LITE protocol [19] for library preparation, and was sequenced on a NovaSeq X Plus, generating 150 bp paired-end reads. Genomic DNA from isolates collected in 2016 underwent sequencing provided by MicrobesNG (https://microbesng.com) using the Nextera XT library prep kit (Illumina) and the Illumina HiSeq 2500, generating 250 bp paired-end reads.
Trimmed and subsampled reads were then used to perform de novo assembly using SPAdes (v.3.13.1) with k-mer sizes of 21, 33, 55 and 77 [21]. Assembly statistics were generated for each isolate using Quast [22] (Tables S1 and S2), and any draft assemblies with more than 500 contigs or a total genome size greater than 2.2 Mb were excluded from downstream analysis, as were any that were determined not to be S. pyogenes based on their genomic sequences. MLST and emm-types were determined from the de novo assemblies using mlst (https://github.com/tseemann/mlst) with the pubmlst database [23] and the emm_typer.pl script (github.com/BenJamesMetcalf/GAS_Scripts_Reference), respectively. New MLSTs were submitted to pubmlst, and new emm-types and sub-types to the CDC emm-type database (https://cdc.gov/streplab) for assignment.
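The assembly exclusion criteria above can be written as a simple filter. The following is a minimal sketch only, not the pipeline used in the study: the record type, field names and example values are hypothetical, and the species-identity check mentioned in the text is not covered.

```python
from dataclasses import dataclass

# Thresholds taken from the text: assemblies with more than 500 contigs
# or a total size above 2.2 Mb are excluded.
MAX_CONTIGS = 500
MAX_GENOME_SIZE_BP = 2_200_000


@dataclass
class AssemblyStats:
    isolate_id: str        # hypothetical identifier
    n_contigs: int         # number of contigs in the draft assembly
    total_length_bp: int   # summed length of all contigs


def passes_qc(stats: AssemblyStats) -> bool:
    """Return True if the assembly is kept for downstream analysis."""
    return (stats.n_contigs <= MAX_CONTIGS
            and stats.total_length_bp <= MAX_GENOME_SIZE_BP)


# Made-up example values, not taken from Tables S1 or S2.
assemblies = [
    AssemblyStats("isolate_A", 85, 1_790_000),
    AssemblyStats("isolate_B", 640, 1_810_000),   # too fragmented
    AssemblyStats("isolate_C", 120, 2_450_000),   # too large
]
kept = [a.isolate_id for a in assemblies if passes_qc(a)]
print(kept)
```

In practice the contig count and total length would be read from the Quast report for each isolate rather than typed in by hand.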
De novo assemblies were annotated using Prokka (v.1.14.6) [24]. Snippy (https://github.com/tseemann/snippy) was used to determine SNP distances between sequence reads and a reference genome. RAxML v8.2.12 [25] was used to generate maximum likelihood phylogenetic trees based on the core gene alignment with a general time-reversible substitution model and 100 bootstraps. Known regions of recombination (prophage regions and other mobile genetic elements) were excluded from reference genomes prior to mapping. Where these regions were unknown in the emm22 reference genome, regions of predicted recombination were identified and removed using Gubbins [26] prior to tree construction. Phylogenetic trees were annotated using iTOL (version 6) [27].
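To make the notion of an SNP distance concrete, the sketch below counts differing sites between sequences that have already been aligned to the same reference coordinates. It illustrates the quantity only and is not how Snippy computes it; the function names and the toy alignment are assumptions, and sites with gaps or ambiguous bases are simply skipped.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count aligned sites where both sequences have an unambiguous base and differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in VALID_BASES and b in VALID_BASES and a != b
    )


def pairwise_distances(alignment: dict[str, str]) -> dict[tuple[str, str], int]:
    """SNP distance for every pair of sequences, keyed by sorted name pair."""
    return {
        (x, y): snp_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Toy alignment (hypothetical sequences); "N" marks an uncalled site.
alignment = {
    "ref":       "ACGTACGTAC",
    "isolate_1": "ACGTACGTAT",
    "isolate_2": "ACTTNCGTAT",
}
distances = pairwise_distances(alignment)
for pair, d in distances.items():
    print(pair, d)
```

Skipping uncalled sites keeps low-coverage positions from inflating distances, at the cost of slightly underestimating divergence for poorly covered isolates.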
Further comparison was made with previously published WGS data from Sheffield, consisting of 142 non-invasive skin and soft tissue isolates collected in 2019 [28], and national and international data from other published collections (Table S3).

Collected isolates
During the unusual S. pyogenes infection upsurge period reported by the UKHSA in England in 2022-2023, we collected a total of 384 non-invasive isolates from the Department of Laboratory Medicine, Sheffield. This collection was compared with 229 isolates collected in 2016-2017, representing a time of more typical seasonal upsurge. The proportion of throat isolates was similar for both time periods (61.7 % and 65.1 %), but during 2022-2023, overall isolate numbers were much higher (Table 1). Significantly more throat isolates were also associated with scarlet fever in 2022-2023 than in 2016-2017 [24.5 % vs 5.4 %, χ2(1) = 23.55, P<0.0001]. For both time periods, throat isolates were primarily from children under 9, but in 2022-2023, significantly more were from children aged 5-9 compared with 2016-2017 [42.6 % vs 19.5 %, χ2(1) = 21.96, P<0.0001]. There were fewer skin than throat isolates for both time periods, and skin isolates were predominantly from those aged 1 year and under and those aged 45 and older. There was evidence in both time periods of a bimodal age distribution, with the highest number of isolates from children under 9 years of age and a second peak in individuals aged 30-39 years (Fig. S1). S. pyogenes was more frequently isolated from throat swabs in female patients [ (77.0 %) from throat swabs and 38 (23.0 %) from skin swabs. We purposely did not sequence isolates from 'other' sites in this earlier collection.
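The χ2 comparisons reported above are tests on 2×2 tables with one degree of freedom. The sketch below shows the calculation as a Pearson χ2 without continuity correction, which is an assumption about the test variant used. The counts are also assumptions: they are back-calculated from the reported percentages and collection sizes (237 throat isolates in 2022-2023 and 149 in 2016-2017) and are not taken from Table 1.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper tail equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Back-calculated (assumed) counts: scarlet fever-associated vs other throat isolates.
sf_stat, sf_p = chi_square_2x2(58, 237 - 58, 8, 149 - 8)
# Back-calculated (assumed) counts: throat isolates from children aged 5-9 vs other ages.
age_stat, age_p = chi_square_2x2(101, 237 - 101, 29, 149 - 29)
print(f"scarlet fever: chi2 = {sf_stat:.2f}, P = {sf_p:.2g}")
print(f"aged 5-9:      chi2 = {age_stat:.2f}, P = {age_p:.2g}")
```

With these assumed counts the statistics agree with the reported values of 23.55 and 21.96 to two decimal places, which suggests, but does not confirm, that the reported tests were uncorrected Pearson χ2 tests on throat isolates.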
The emm-type for each isolate was extracted from the WGS data. The frequency of isolates and the distribution of emm-types varied over time in the 2022-2023 collection, with the total number peaking in week 49, in keeping with UKHSA data [3]. We observed a fall in the number of throat swabs following the issue of interim clinical guidance by NHS England on the 9th of December 2022 (week 49); this guidance advised clinicians to lower their threshold to empirically treat children with sore throats, including when their presentation may be secondary to viral respiratory illness. Across all 2022-2023 isolates, a total of 33 different emm-types were identified, with emm1 being the most common at 95/341 (27.9 %), followed by emm12 at 64/341 (18.8 %) (Fig. 1a). This high level of emm1 and emm12 cases was reflected in an increase in the number of throat samples in late November to early December 2022 (Fig. 2). Within our comparative collection from 2016-2017, 28 different emm-types were identified overall, most frequently emm89 (30/165, 18.2 %). The 2022 upsurge in throat disease was followed by a surge in skin disease in late December 2022, with a second peak in late January 2023 (Fig. 2). This was driven by 23 different emm-types, most frequently emm1 (22/100, 22 %) and emm12 (10/100, 10 %), but also emm76 (18/100, 18 %) and emm49 (7/100, 7 %), both of which were rarely found in the throat. This differed from our 2016-2017 collection, for which there were 17 different emm-types but emm43 (31.6 %, 12/38) was dominant, followed by emm89 (15.8 %, 6/38) and emm28 (10.5 %, 4/38) (Fig. 1a). Only a single skin isolate was emm1, and none were emm12 or emm76. Each isolate was also assigned an emm-cluster, a functional classification grouping closely related M proteins that share binding and structural properties (Fig. S2). Within the 2022-2023 collection, the most common were the emm1 cluster type A-C3 (27.9 %, 95/341) and the emm12 cluster type A-C4 (18.8 %, 64/341). E4 was also common (19.6 %, 67/341), representing emm22, emm77, emm89 and emm102. By comparison, emm-cluster E4 made up the largest proportion of isolates within the 2016-2017 collection (28.5 %, 47/165), reflecting the high number of emm89 isolates from both throat and skin samples. The 2016-2017 collection also had a number of samples in cluster D4 (9.1 %, 15/165), all of which were skin isolates and mostly emm43 (80 %, 12/15).
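The emm-type proportions quoted in this section are simple frequencies over the sequenced isolates. As a minimal sketch, assuming the emm-type calls are available as a plain list of strings (the function name and input layout are hypothetical), they could be tabulated as follows.

```python
from collections import Counter


def emm_type_frequencies(emm_types: list[str]) -> list[tuple[str, int, float]]:
    """Return (emm-type, count, percentage) rows, most common first."""
    total = len(emm_types)
    if total == 0:
        return []
    counts = Counter(emm_types)
    return [(emm, n, round(100 * n / total, 1)) for emm, n in counts.most_common()]


# Toy input: the emm1 and emm12 counts are those reported in the text
# (95 and 64 of 341); the remaining 182 isolates are lumped together as
# "other" purely for illustration.
calls = ["emm1"] * 95 + ["emm12"] * 64 + ["other"] * 182
rows = emm_type_frequencies(calls)
for emm, n, pct in rows:
    print(f"{emm}\t{n}\t{pct} %")
```

The same function applied to throat or skin isolates alone would give the site-specific proportions, provided the list is filtered by swab source first.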

Antimicrobial resistance genes
The presence of antimicrobial resistance genes in our 2022-2023 throat and skin isolate genomes was relatively low overall (Table S1), with only 29.4 % (91/309) carrying at least one gene. By far the most common was tetM at 24.3 % (75/309), and 28.5 % of isolates carried at least one tet gene (tetL, tetM, tetO or tetT). The second most common resistance gene was ermA (4.9 %). No resistance genes were identified in emm1, and only three emm12 carried resistance genes (mefA and msrD, encoding macrolide resistance). Fewer 2016-2017 throat and skin isolate genomes carried at least one resistance gene (18.2 %, 30/165) (Table S2), but all of these included a tet gene (tetM or tetO), with tetM being the most common at 15.8 % (26/165).

Hyaluronic acid capsule synthesis and the nga-ifs-slo toxin loci
We previously identified an increasing number of emm-types that had undergone recent genetic changes leading to the inability to produce the hyaluronic acid capsule, through loss or nonsense mutation of capsule synthesis genes hasABC [31]. Forty per cent of our Sheffield 2022-2023 throat and skin isolates were predicted to be unable to synthesize the hyaluronic acid capsule due to nonsense mutations in or the absence of the hasA, hasB and hasC genes. While sporadic nonsense mutations occurred in hasA or hasB in some emm-types, such as emm1 (4.9 %) and emm12 (14.5 %), all isolates belonging to 11 different emm-types were predicted to be acapsular (Table S1). This included emm22 and emm89, for which, as expected, the entire hasABC locus was absent; emm28, emm77 and emm87, which had previously described nonsense mutations in hasA [31,32]; and all emm9, emm29, emm58, emm81, emm90 and emm94 isolates. Typically, emm4 also lacked the hasABC locus, although one isolate was found to carry it. Nonsense hasA mutations were also found in the majority of emm75 (78.5 %) and emm11 (66.7 %).
Only 51 % of 2022-2023 throat isolates were predicted to be able to produce capsule compared with 78 % of 2022-2023 skin isolates. Loss of capsule was predominantly associated with emm-pattern E isolates, for which only 27 % were predicted to be encapsulated compared with 91 and 92 % of patterns A-C and D, respectively. This was similar in the 2016-2017 isolates, where only 30 % of pattern E isolates were predicted to be able to make capsule (Table S2).
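The capsule prediction described above reduces to a rule over the state of the capsule synthesis genes. The sketch below is an illustration of that rule rather than the scripts used in the study. It assumes that each gene has already been scored as intact, truncated or absent, and that disruption of hasA or hasB alone is enough to predict an acapsular phenotype; whether disruption of hasC alone should count is left configurable because the text does not say.

```python
from enum import Enum


class GeneState(Enum):
    INTACT = "intact"
    TRUNCATED = "truncated"   # nonsense or frameshift mutation giving a premature stop
    ABSENT = "absent"


# Assumption: hasA and hasB are the genes whose disruption predicts capsule loss.
REQUIRED_GENES = ("hasA", "hasB")


def predicted_encapsulated(locus: dict[str, GeneState],
                           required: tuple[str, ...] = REQUIRED_GENES) -> bool:
    """Predict capsule production: every required gene must be present and intact.

    A gene missing from `locus` is treated as absent.
    """
    return all(locus.get(g, GeneState.ABSENT) is GeneState.INTACT for g in required)


# Hypothetical locus profiles modelled on patterns described in the text.
intact_locus = {g: GeneState.INTACT for g in ("hasA", "hasB", "hasC")}
locus_absent = {g: GeneState.ABSENT for g in ("hasA", "hasB", "hasC")}
hasA_truncated = {**intact_locus, "hasA": GeneState.TRUNCATED}

for name, locus in [("intact", intact_locus),
                    ("locus absent", locus_absent),
                    ("hasA truncated", hasA_truncated)]:
    print(name, "->", "encapsulated" if predicted_encapsulated(locus) else "acapsular")
```

A genotype-based call of this kind says nothing about regulation, so an isolate with an intact locus may still produce little capsule in vivo.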
We previously identified convergent evolution, with acapsular isolates also having undergone homologous recombination resulting in increased expression of the toxins NADase and streptolysin O (SLO) [31,33,34]. High or low expression of these toxins can be linked to three residues in the promoter region of the nga (encoding NADase), ifs (encoding the inhibitor of NADase) and slo locus [34]. Within the 2022-2023 throat isolates, 90 % were predicted to have high-toxin expression (as defined previously [31]), compared with 56 % of skin isolates (Table S1). Only 8 % of pattern D isolates were predicted to have high-toxin expression, compared with 99 % of A-C and 66 % of E. Additionally, only 58 % of pattern D isolates were predicted to express active NADase, based on a glycine residue at codon 330 rather than an aspartate [35], compared with 99 % and 100 % of pattern A-C and E isolates, respectively. Although emm1 and emm12 isolates were predominantly encapsulated with a high-toxin expression genotype, an acapsular, high-toxin expression genotype was found in 41 % of throat isolates compared with a significantly lower 22 % of skin isolates [χ2(1) = 10.91, P=0.001]. Only 2 % of throat isolates were predicted to be encapsulated with a low-toxin expression genotype compared with 44 % of skin isolates. This was similar in 2016-2017, with 8 % of throat isolates but 50 % of skin isolates predicted to be encapsulated with a low-toxin expression genotype (Table S2). Interestingly, the association of capsule and toxin expression genotype with infection site was maintained even within pattern E isolates. A significantly higher proportion of 2022-2023 pattern E isolates from the skin were encapsulated with a low-toxin expression genotype compared with the pattern E throat isolates [58.9 % vs 3.
S3A). The other chromosomal superantigens were seen less frequently, with speJ in 42.4 % (131/309) and the co-transcribed speQ and speR in 12.6 % (39/309). Of the prophage-associated superantigens, speC was the most common at 52.4 % (162/309), followed by speA at 32 % (99/309). One emm49 skin isolate carried a speK/speM fusion gene, which we previously identified in an emm65 [28]. Of the 99 isolates that carried speA, 80.8 % (80/99) were emm1.
In comparison, a similar proportion of isolates from the 2016-2017 collection carried the most common superantigen genes, smeZ and speG, at 94.5 % (156/165) and 90.9 % (150/165), respectively (Fig. S3B).Prophage-associated speC was found in 64.S1 and S2), but the impact of these is difficult to determine, and many were associated with emm-type.

emm1
A total of 95 (27.9 %) of the 341 genomes collected in 2022-2023 were emm1. Sixty (63.2 %) of these were from throat isolates, of which 24 were associated with scarlet fever. Twenty-two (23.2 %) were skin isolates, and the remaining 13 were from other sites, including 12 from ear swabs and one from an eye swab. All these emm1 isolate genomes clustered within the M1UK lineage, and all carried the 27 lineage-defining SNPs (Fig. 4). Similarly, for our 2016-2017 emm1 isolates, only one out of 17 emm1 isolates was not M1UK. This is in keeping with the emergence of M1UK as the dominant emm1 strain globally. Sheffield isolates were spread throughout the M1UK phylogeny without evidence of expansion of a specific sub-clade. Two scarlet fever throat isolates lacked the speA-carrying Φ5005.1 phage. All other isolate genomes had speA; the chromosomal speG, speJ and smeZ; and a combination of other prophage-associated superantigen genes. No clear correlation was seen between clinical presentation and the presence of a particular profile of superantigens and/or DNases (Fig. S5).
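Assigning an emm1 genome to the M1UK lineage amounts to checking for the 27 lineage-defining SNPs relative to M1global. The sketch below shows one way this check could be expressed. The SNP coordinates and alleles are placeholders, since the real positions are not listed here, and the label used for genomes carrying only some of the SNPs is an assumption rather than a category reported in this study.

```python
# Placeholder coordinates and alleles standing in for the 27 lineage-defining SNPs;
# these are NOT the real M1UK positions.
HYPOTHETICAL_M1UK_SNPS = {(pos, "T") for pos in range(1000, 28000, 1000)}


def classify_emm1_lineage(observed_snps: set[tuple[int, str]],
                          defining_snps: set[tuple[int, str]]) -> str:
    """Classify an emm1 genome by how many lineage-defining SNPs it carries."""
    if not defining_snps:
        raise ValueError("defining_snps must not be empty")
    n_found = len(observed_snps & defining_snps)
    if n_found == len(defining_snps):
        return "M1UK"
    if n_found == 0:
        return "M1global"
    return "intermediate"   # carries some but not all defining SNPs


# Toy genomes described by their SNP calls against an M1global reference.
full_set = HYPOTHETICAL_M1UK_SNPS | {(5, "A")}      # all 27 plus an unrelated SNP
partial_set = set(sorted(HYPOTHETICAL_M1UK_SNPS)[:13])
print(classify_emm1_lineage(full_set, HYPOTHETICAL_M1UK_SNPS))
print(classify_emm1_lineage(partial_set, HYPOTHETICAL_M1UK_SNPS))
print(classify_emm1_lineage({(5, "A")}, HYPOTHETICAL_M1UK_SNPS))
```

Matching on both position and allele avoids counting a different substitution at the same site as lineage-defining.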

emm12
A total of 64/341 (18.8 %) 2022-2023 isolate genomes were emm12, the majority of which were throat isolates (52/64, 81.3 %), and 11 of these were associated with scarlet fever. Ten were skin swab samples, and two were ear swabs. For the 2016-2017 collection, 12 out of 124 (9.7 %) genomes were emm12, and all were from a throat source.
The core genome phylogeny of global emm12 isolates showed four distinct clades (Fig. 5), as described previously [37]. The majority of all Sheffield isolates were clade I or clade IV, alongside other UK and European isolates. A single 2022-2023 Sheffield isolate was clade II, and no 2022-2023 isolates were found within clade III. Our phylogeny also showed three sub-clades within clade IV, one of which was dominated by Sheffield, other UK and European isolates, while isolates from the USA were restricted to the other two sub-clades, one of which also included Asian strains. Other US isolates also dominated clade II.
presentation and the presence of a particular profile of superantigens and/or DNases. Three 2022-2023 throat isolates in clade I carried mefA and msrD genes, associated with acquired macrolide resistance. Interestingly, ten 2022-2023 isolates with the CovR A105G variant all clustered together within clade I; a single isolate within this cluster also carried a RocA variation (G184R). A separate cluster of four isolates in clade IV carried the same mutation in rocA, leading to a premature stop codon after 427 aa.
The single emm82 isolate from 2022-2023 and all seven emm82 isolates from 2016-2017 had the same ST as emm12: ST36. These isolates were confirmed to belong to the lineage of emm82 recently identified to have arisen through recombination and emm-switching of emm12 to emm82 [38,39].

emm4
emm4 was associated with invasive disease in children <15 years early in the 2022-2023 upsurge [2] and was previously associated with the 2014 scarlet fever upsurge in England [4]. Of the twelve 2022-2023 emm4 isolates, nine were throat isolates, including three associated with scarlet fever; two were skin isolates; and one was an eye swab isolate. One skin-associated isolate had a highly divergent core genome and uniquely was ST289 and emm-subtype 4.2 rather than ST39 and emm-subtype 4.0 or 4.19. It also carried the hasABC locus, which was characteristically absent in the other 11 emm4 isolates.
Phylogeny of the other eleven 2022-2023 isolates within a wider emm4 population showed only a single strain clustering with the previously described 'Degraded' lineage [40], so named due to substantial loss of genes within the three prophages and the integrated conjugative element associated with emm4 (Fig. 6). This lineage also has a fusion of the 5′ end of the emm gene with the 3′ end of the downstream enn gene [41]. The five emm4.19 isolates clustered together within a 'Complete' sub-lineage, closely related to other UK isolates. The remaining five isolates clustered with the recently described invasive disease-associated M4NL22 lineage from the Netherlands [42]. This appears to be a recent emergence of this lineage in England, as no emm4 isolates in our 2016-2017 comparative collection nor in our Sheffield 2019 isolates [28] were found clustering with M4NL22 isolates. Our seven Sheffield 2016-2017 emm4 isolates included four within the 'Degraded' lineage, while three were 'Complete'. This split was similar in our 2019 Sheffield isolates, with four isolates in each of these lineages. This pattern of an even divide of UK isolates between 'Complete' and 'Degraded' lineages is consistent with our phylogeny and previous findings [40], with the shift towards dominance of the 'Complete' lineage, at least in Sheffield, emerging during the 2022-2023 upsurge.
2022-2023 isolates, except two, together in expansion of a single lineage (Fig. 7). All Sheffield emm22 isolates carried a CovR V128A variant, clustering together in a lineage with 12 other isolates from the UK, Europe and the USA all also carrying the same variant. These isolates also carried tetM, alongside additional strains from the same parent lineage. All emm22 isolates, Sheffield and globally, carried the same variation in RocA (V333A) compared with a reference RocA sequence.

emm75
Within all 2022-2023 Sheffield isolates, 4.4 % were emm75 (15/341), of which ten were from a throat source (four of these associated with scarlet fever), four from skin and one from an ear swab. By comparison, in 2016-2017, 9.1 % of isolates were emm75 (15/165), all of which were throat isolates not associated with scarlet fever. All Sheffield emm75 isolates from both 2022-2023 and 2016-2017 were ST150, distinct from the other dominant lineage of emm75, ST49. Of the 2022-2023 isolates, 13/15 formed a new sub-lineage with identical superantigen profiles (Fig. 8). Within this sub-lineage, 12/13 had the same additional T in a 7-residue homopolymeric tract in hasA, leading to a premature stop codon after 46 amino acids, and are therefore likely to be acapsular. This is new compared with the 2016-2017 Sheffield isolates, none of which carried a hasA mutation and which were scattered throughout the phylogeny. Other acapsular emm75 strains with the same premature stop codon in hasA were identified in a small cluster of US isolates and a single sporadic UK strain.

emm76
In 2022-2023, 5.6 % (19/341) of isolates were emm76, compared with none in 2016-2017. Of these, 18 were from a skin source and one from an ear swab, and all were ST378, with no mutations in hasA or hasB, and predicted low-toxin expression (Fig. S7). All carried tetM, as described previously within this lineage [31]. This reflects a recent expansion of the encapsulated, low-toxin ST378 lineage, in contrast to the previously described ST50 lineage, which was acapsular with high-toxin expression [31].

emm77
In 2022-2023, 4.1 % (14/341) of isolates were emm77, an increase from 2.4 % (4/165) in 2016-2017. Of the 2022-2023 isolates, 8 were from throat swabs and not associated with scarlet fever, 5 were from a skin source and 1 was from an ear swab. The majority (11/14) were ST63, carried the resistance genes tetO and ermA and the CovR M170I variation, and belonged to the previously described acapsular (truncated HasA after 154 aa), high-toxin expression lineage [31]. The two other emm77 isolates were ST399, quite distinct from ST63; these were also predicted to be acapsular (truncated HasA after 46 aa) but with low-toxin expression.

Invasive S. pyogenes isolates
During the time of our 2022-2023 collection, the Department of Laboratory Medicine, Sheffield, identified 19

DISCUSSION
In recent years, the UK has experienced significant rises in morbidity and mortality in association with substantial upsurges in S. pyogenes infections, including in 2017-2018 and in 2022-2023 [3]. Despite this, study of non-invasive isolates has been limited, and the capacity for dynamic changes in non-invasive disease to drive upsurges and invasive disease is poorly understood. We sequenced 341 non-invasive isolates from Sheffield during the major 2022-2023 UK upsurge, and although we found emm1 and emm12 to be the leading causes of both throat and skin infections, they were differentially followed by emm22, emm87 and emm89 in throat infections but emm76 and emm49 in skin. A comparison to non-upsurge isolates from 2016-2017 indicated that emm1 and emm12 contributed significantly to the 2022-2023 upsurge, but other emm-types had also changed over time, with more emm22 in throat infections and emm76 in skin infections. All 2022-2023 emm1 isolates were the prevalent M1UK lineage, but more diverse lineages were identified in other emm-types, including emm12, and emergent lineages in others, including emm75, demonstrating that, at least local to Sheffield, the upsurge was not primarily caused by a single genotype.
For both time periods studied, non-invasive S. pyogenes infections were most common in those aged 0-9 years, but more infections were seen in The dominant peak in throat isolates at week 49 of 2022 within our collection, followed by a second smaller peak in week 3 of 2023, coincides with the peaks in scarlet fever notifications reported by the UKHSA during this time period [3]. During week 49, the NHS England group A Streptococcus interim clinical guidance summary for case management was issued, temporarily altering the clinical scoring criteria threshold for immediate antibiotic treatment in children with a sore throat. The issue of this guidance on the 9th of December 2022 likely enhanced clinician confidence in the empirical diagnosis of S. pyogenes throat infection and therefore may have contributed to a fall in throat swabs being received by the diagnostic laboratory. However, this also coincides with the fall in scarlet fever reporting nationally [3]. In contrast, the prominent peak in skin isolates occurred in week 52, between the peaks in throat isolates. This may be due to behavioural or environmental factors, such as increased skin-to-skin transmission of S. pyogenes associated with increased social mixing or altered chronic wound care at this time of year.
Analysis of national iGAS data from this time period identified a significant association between M1UK and pleural isolates, likely the result of an early S. pyogenes upsurge coinciding with the respiratory virus season, facilitating disease progression [44]. Nationally, emm1 and emm12 were the most common emm-types among invasive isolates in all age groups during this upsurge, notably with emm4 as the next most common emm-type in children [2]. While we identified that 57.9 % of our invasive isolates were emm1, our dataset did not identify any invasive emm12 or emm4 isolates, and the remaining 42.1 % (8/19) of invasive isolates during the 2022-2023 collection were of four other emm-types, potentially a reflection of regional variation in circulating strains within England. The skin-prevalent emm76 did cause a substantial proportion of skin and iGAS infections, suggesting we should not overlook this infection site. It is not known if the pronounced association of emm76 with skin infections was a local phenomenon, as national data on skin infection emm-types were not collected.
Tissue tropism within S. pyogenes infections is well recognized, but bacterial molecular adaptations associated with tissue specialization remain incompletely understood [45]. An acapsular genotype was frequently seen in throat isolates in this study, with just 51 % predicted to be able to produce capsule compared with 78 % of skin isolates. In throat isolates, these acapsular strains were predominantly 'generalist' pattern E. Just 27 % of pattern E isolates overall were predicted to be encapsulated and made up 44.5 % of throat infections. Interestingly, although more skin than throat infections were pattern E, at 56 %, a higher proportion (62.5 %) of these were encapsulated. We also identified evidence of evolving capsule loss with the emergence of a recent acapsular sub-lineage of emm75, a pattern E type, and the increase in the prevalence of acapsular emm22.
We further observed tissue-specific differences in predicted expression of the toxins NADase and SLO based on the promoter sequence, with 90 % of throat isolates predicted to express high levels of toxins compared with 56 % of skin isolates, highlighting a key role for toxin expression in the pathogenesis of S. pyogenes throat infections. We previously provided evidence of convergent evolution with acapsular strains gaining increased toxin expression through homologous recombination [31]. Successful emergent lineages characterized by this recombination event have been identified previously in pattern E emm76, emm77 and emm87. Interestingly, while we continued to see expansion of these acapsular/high-toxin lineages in emm77 and emm87, in emm76 we observed instead an expansion of an ST378 encapsulated/low-toxin lineage to become a leading cause of skin infections. Indeed, overall, we observed significant differences between capsule/toxin-expression genotypes by infection site, with more throat isolates possessing an acapsular/high-toxin genotype and more skin isolates possessing an encapsulated/low-toxin genotype. This association was maintained even within the 'generalist' pattern E isolates when divided by isolate source, suggesting that differential capsule and NADase/SLO expression is a bacterial adaptation mechanism for tissue tropism. Lineages within pattern E emm-types 76, 77, 87 and 89 that have undergone genetic changes to become acapsular with high-toxin expression appear to have emerged recently [31]; the rise to dominance of the new acapsular/high-toxin variant of emm89 in the UK population occurred in 2007-2008 [33]. Here, in 2022-2023 compared with 2016-2017, we observed an increase in the prevalence of acapsular/high-toxin expression genotype emm22 causing throat infections and the emergence of an acapsular emm75 lineage.
Why there has been a recent shift towards an acapsular/high-toxin expression genotype and why this supports throat infections over skin infections is unclear. Although capsule has been shown to promote upper respiratory tract infections for some genotypes [46,47], long-term throat carriage isolates lose capsule expression through mutations in hasABC [48], potentially allowing for invasion into host cells that is otherwise hindered by capsule expression. Why this would be preferential in the throat compared with the skin is also unclear. Capsule also provides protection against phagocytosis, and this has also been shown to promote both skin and throat infections [47]; however, this may be an emm-type-specific phenomenon [49], and an increase in NADase/SLO expression may compensate for capsule loss [50]. An association between NADase activity and tissue tropism has also been identified previously, with NADase-inactive strains being primarily 'skin-associated' emm-pattern D, suggesting toxin may be less essential for S. pyogenes infection in the skin ecological niche compared with in the throat [51], although this could also be related to capsule expression in this niche, rendering NADase activity non-essential. We are limited in our study by the fact that we are only looking at genotypes and predicting phenotype, and it is possible that predicted capsule or NADase/SLO expression may be different in vivo. Work is ongoing to determine these phenotypes and the association with tissue tropism.
The most frequently identified emm-type across both throat and skin infections in 2022-2023 was emm1; all were M1UK lineage and distributed throughout the M1UK phylogeny without evidence of local expansion of a specific sub-clade. While emm1 is classically 'throat-associated', it is likely that the high number of emm1 cases in the skin reflected the high burden of throat disease and onward transmission to the skin [52]. In contrast, emm12 isolates from our collections in both 2022-2023 and 2016-2017 were distributed predominantly across two of the four clades: I and IV. Previous work has suggested an association in clades I-III between scarlet fever and ssa (encoding streptococcal superantigen A), and an absence of scarlet fever in clade IV; however, no such clear associations were seen in our collection [37]. Geographical variation was apparent across our emm12 phylogeny, with the majority of Sheffield isolates found within a single sub-clade of clade IV and clade I, in a distribution distinct from that of USA isolates and again from Asian isolates. We observed similar geographical variation in emm87, with two of the four main clades dominated by USA strains and one clade containing the majority of Sheffield and other UK strains. These variations are in keeping with regional divergence of circulating lineages and can give rise to the emergence of geographically restricted sub-lineages, some of which have been seen to expand more widely if carrying a fitness advantage [7].
Comparison of isolates from 2022-2023 to 2016-2017 revealed differences in emm-type distribution between collections. More isolates from 2022-2023 were emm1 and emm12 compared with 2016-2017, across all infection sites. Within throat samples, 2022-2023 saw a fall in the number of emm89 and emm75 isolates compared with 2016-2017. Within skin site isolates, we saw more emm76 and emm49 in 2022-2023, and fewer emm43, emm89 and emm28 than in 2016-2017. Where there is seasonality of S. pyogenes infection, as seen in Sheffield, seasonal fluctuation in emm-type distribution also frequently occurs. It is likely this is multifactorial, affected by antibiotic pressures, climate, patterns of social mixing and variations in concurrent circulating respiratory viruses. Population immunity also plays a key role in successful strain transmission; the strongest immunological protection is emm-type-specific, though infection with one emm-type can confer a degree of protection to other emm-types, thereby influencing the profile of circulating strains in subsequent seasons [53]. Genomic changes in circulating strains also have the potential to enhance strain virulence, with such changes underpinning previous upsurges in S. pyogenes disease [7]. We have demonstrated several emergent sub-lineages in our 2022-2023 isolates, including within emm22 and emm75, which is likely to further impact strain diversity between seasons.
Our study was limited by the lack of genomic data for local iGAS isolates; the addition of these data would allow a more detailed genomic comparison to better understand the interplay between non-invasive and invasive diseases. Our throat samples were deemed scarlet fever or not scarlet fever based on details provided by clinicians, which were often incomplete. Public health and media messaging may have also contributed to ascertainment bias. In weeks 49 and 50 of our study, our diagnostic laboratory received substantially more throat swabs compared with these same weeks in previous years, with high numbers of S. pyogenes identified but a lower overall swab positivity.
Overall, we found that an increase in prevalence of emm1 and emm12 in non-invasive disease in 2022-2023 locally reflected the national increase of these emm-types in iGAS and scarlet fever cases. We have highlighted the need to study both throat and skin infection isolates as they can differ, and the potential for differential capsule expression and/or NADase and SLO toxin expression to drive tissue tropism. We also demonstrated that emm-types may not be represented by single lineages and that these can change over time and by geographical location. Frequent monitoring with WGS is needed to determine how rapidly these changes occur, what factors influence these changes and how they might drive infection rates, including overspill from non-invasive disease into invasive disease and upsurges.

Fig. 1. (a) Distribution of emm-types by collection year and clinical isolate type. (b) Distribution of emm-types within throat isolates by clinical presentation. Data presented as a percentage of (a) the total number of isolates (n=341 for 2022-2023, n=165 for 2016-2017) and (b) the total number of throat isolates (n=209 for 2022-2023, n=127 for 2016-2017) in each collection period.

Fig. 2. Distribution of emm-types across time for throat and skin isolates collected in 2022-2023. Week numbers and dates are presented on the x-axis. The dashed line represents the introduction of the NHS England group A Streptococcus interim clinical guidance summary for case management, on the 9th of December 2022.

Table 1. Clinical characteristics of isolates collected in 2022-2023 and 2016-
5-9-year-old children in 2022-2023 than in 2016-2017 (33.3 % vs 15.7 %). Overall, those aged 4, 5, 6 or 7 years had the highest rates of S. pyogenes infection of any age in 2022-2023. This may reflect accelerated exposure to infection associated with school attendance, in keeping with reduced immunity to S. pyogenes and common respiratory viruses in this age group following reduced exposure during the COVID-19 pandemic [2, 43].